Abstract: This paper is motivated by the observation that,
in many cases, we do not need to serve specific messages,
but rather any message within a content type. Content-type
traffic pervades a host of applications today, ranging from
search engines and recommender networks to newsfeeds and
advertisement networks. The paper asks a novel question:
are there benefits in designing network and channel codes
specifically tailored to content-type requests? It provides three
examples of content-type formulations to argue that, indeed,
in some cases such benefits can be significant.

I. Introduction 

Coding traditionally aims to securely and efficiently convey
specific information messages to one or more receivers.
This broad aim encompasses most of the work in the
field, from the channel coding theorem of Shannon [1],
to recent breakthroughs in channel coding [2], [3], network
coding [4], and index coding [5]. However, communication
networks today are increasingly used to serve a fundamentally
different kind of traffic, one that delivers a type of content rather
than specific messages. As a simple example, when we use
the Internet to access our bank account, we ask for and want
to see very specific information. However, if we search for
a photo of a hummingbird, we do not care which specific
hummingbird photos we receive - we do not even know
which hummingbird pictures are available - we only care
about the content type, that it is a hummingbird and not
an owl photo.

Content-type traffic pervades a host of applications today.
For instance, content-delivery networks, such as the Akamai
network, in many cases do not need to satisfy message-specific
requests, but instead content-type requests (e.g.,
latest news on Greece, popular sites on CNN.com, etc.);
smart search engines and recommendation systems (e.g.,
Google, Pandora) generate mostly content-type
traffic; advertisement networks (e.g., serving ads on hotels
or cars) and newsfeeds on social networks (e.g., cultural
trends, following celebrities) also fall in the content-type
category. The fact that content forms a significant percentage
of the Internet traffic has been well recognized,
especially in the networking community; however, most of
the work looks at what to replicate, where and how to store
it, and from where to retrieve specific data.

We ask a very different question: are there benefits in
designing network and channel codes specifically tailored
to content-type traffic? We make the case that, indeed, if
we realize that we need to serve content-type rather than
specific messages, we can in some cases achieve significant
benefits. The fundamental reason content-type coding helps
is that it offers an additional dimension of freedom: we
can select which specific message within the content type
to serve, so as to optimize different criteria. For instance, we
can create more coding opportunities, transform multiple
unicast sessions into a multicast session, and adapt to random erasure
patterns, all of which can lead to higher throughput.

Different content-type coding problems admit
different mathematical representations. Finding a consistent
mathematical model for a general content-type coding
problem is non-trivial and beyond the scope of this
paper. Instead, we study content-type coding
through several examples that show its significant
benefits over conventional
message-specific coding. First, we provide an example of
benefits over networks: we consider a classical example
in the network coding literature, the combination network,
and show the benefits in a content-type setup. Second, we
provide an example of benefits over lossy networks: we
consider a broadcast erasure channel with feedback,
where a source wants to send messages to two receivers.
Third, we review an example within the index coding setup,
termed pliable coding, that we presented in a previous work;
we introduce an algebraic framework for pliable coding and use
this framework to provide a novel lower bound.

The paper is organized as follows. Section II considers
content-type coding over the combination network; Section III
looks at broadcast erasure networks; Section IV
introduces an algebraic framework for pliable index coding;
and Section V concludes the paper with a short discussion.

II. Content-type coding over networks 

Motivating example: Fig. 1 provides a simple example
of content-type coding benefits. We start from the classical
butterfly network, with two sources, $s_1$ and $s_2$, two
receivers, $r_1$ and $r_2$, and unit capacity directed links, where
each source has 2 messages of a given type. In the example,
source $s_1$ (say advertising hotels) has 2 unit rate messages,
$\{b_{11}, b_{12}\}$, and source $s_2$ (say advertising car rentals) also
has 2 unit rate messages, $\{b_{21}, b_{22}\}$. The min-cut to each
receiver is two; thus, if she used all the network resources
by herself, we could send her two specific messages,
for instance, $b_{11}$ and $b_{21}$ to $r_1$, and $b_{12}$ and $b_{22}$ to $r_2$.
However, if we want to send to both receivers these specific
requests, it is clearly not possible, as there are no coding
opportunities; we can satisfy one receiver but not both.
In contrast, if each receiver requires one message from
each type, then we can multicast, say, $b_{11}$ and $b_{21}$ to both
receivers.

Fig. 1: Example where each source has messages of a given type. (a) Requesting specific messages. (b) Requesting one message from each type.
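The multicast solution in the content-type case can be illustrated with a small sketch. It assumes, for illustration only, that messages are small integers and that the intermediate node combines them with a bitwise XOR; the function name `butterfly_multicast` and the message values are hypothetical and not part of the paper.

```python
# Toy model of the butterfly network with unit-capacity links.
# Assumption: messages are integers and the bottleneck link carries their XOR.

def butterfly_multicast(b11: int, b21: int) -> dict:
    """Deliver one message of each type (b11 and b21) to both receivers."""
    coded = b11 ^ b21  # symbol sent over the shared bottleneck link

    # r1 hears b11 on its direct link from s1 and recovers b21 from the coded symbol.
    r1 = (b11, coded ^ b11)
    # r2 hears b21 on its direct link from s2 and recovers b11 from the coded symbol.
    r2 = (coded ^ b21, b21)
    return {"r1": r1, "r2": r2}


result = butterfly_multicast(b11=0b1010, b21=0b0110)
```

Both receivers end up with the pair (b11, b21), one message from each type, using each link once.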

Problem formulation: We consider a network represented
as a directed graph $G = (V, E)$, a set of $m$ sources
$\{s_1, s_2, \ldots, s_m\}$ and a set of $n$ receivers $\{r_1, \ldots, r_n\}$.
Each source has $u$ messages of the same type, and different
sources have different types of messages. We denote by
$B_i = \{b_{i1}, b_{i2}, \ldots, b_{iu}\}$ the set of type-$i$ messages from
source $s_i$.

Performance metrics: Given a transmission scheme,
we use the rate towards a receiver to measure the efficiency
of the transmission scheme. In the conventional message-specific
coding problem, each receiver $r_j$ requests specific
messages, one from each type, denoted by a fixed element
$(b_1, b_2, \ldots, b_m) \in B_1 \times \cdots \times B_m$. For $T$ transmissions,
the messages requested by receiver $r_j$ are denoted by
$(\mathbf{b}_1^T, \mathbf{b}_2^T, \ldots, \mathbf{b}_m^T)$, where each element is a vector
$\mathbf{b}_i^T = (b_i(1), \ldots, b_i(t), \ldots, b_i(T))$.

We denote by $\mathcal{R}_j$ the set of messages that the receiver
$r_j$ can decode after $T$ transmissions, and we say that the
transmission rate towards $r_j$ is the number of requested
messages that can be decoded by the receiver $r_j$ per
transmission, i.e.,

$$R_j = \frac{1}{T} \sum_{t=1}^{T} \sum_{i=1}^{m} \mathbb{1}_{\{b_i(t) \in \mathcal{R}_j\}}. \tag{1}$$


In the content-type formulation, each receiver $r_j$ requests
to receive (any) one message from each type, and does not
care which specific one. We denote the requested content-type
messages by an arbitrary element $(x_1, x_2, \ldots, x_m) \in
B_1 \times \cdots \times B_m$. For $T$ transmissions, the messages requested
by $r_j$ are denoted by $(\mathbf{x}_1^T, \mathbf{x}_2^T, \ldots, \mathbf{x}_m^T)$, where each
element is a vector $\mathbf{x}_i^T = (x_i(1), \ldots, x_i(t), \ldots, x_i(T))$.
Similarly, the rate towards $r_j$ is defined as the number of
requested messages that can be decoded by the receiver $r_j$
per transmission, i.e.,

$$R_j^c = \frac{1}{T} \sum_{t=1}^{T} \sum_{i=1}^{m} \mathbb{1}_{\{x_i(t) \in \mathcal{R}_j\}}. \tag{2}$$


We would like to study the rate averaged among all
receivers for message-specific coding, denoted by $R$, and
that for content-type coding, denoted by $R_c$. Clearly,
for message-specific coding, the rate depends upon the
specific message requests. We denote by $R_w$ the worst-case
rate (minimizing among all possible sets of requests),
and by $R_a$ the average rate (averaged over all possible sets
of requests). We define the worst case and average case
gains, respectively, as:

$$G_w = \frac{R_c}{R_w}, \qquad G_a = \frac{R_c}{R_a}. \tag{3}$$
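As a rough illustration of these metrics, the following sketch counts decoded requests per transmission and forms the gain ratio. The data layout and function names are hypothetical, and the content-type rate uses one assumed reading of the definition: a type counts as served in a slot if any of its messages has been decoded.

```python
def message_specific_rate(requests, decoded):
    """requests[t][i] is the specific type-i message wanted in slot t;
    decoded is the set of messages the receiver can decode."""
    T = len(requests)
    hits = sum(1 for slot in requests for msg in slot if msg in decoded)
    return hits / T


def content_type_rate(type_sets, decoded):
    """type_sets[t][i] is the set of acceptable type-i messages in slot t.
    Assumed reading: type i is served if any of its messages was decoded."""
    T = len(type_sets)
    hits = sum(
        1 for slot in type_sets for B in slot if any(b in decoded for b in B)
    )
    return hits / T


def gain(content_rate, specific_rate):
    """Ratio of the content-type rate to a message-specific rate."""
    return content_rate / specific_rate


# One slot (T = 1), two types with two messages each.
B1, B2 = {"b11", "b12"}, {"b21", "b22"}
decoded = {"b11", "b21"}
r_specific = message_specific_rate([("b12", "b21")], decoded)
r_content = content_type_rate([(B1, B2)], decoded)
```

In this toy slot the receiver that insisted on b12 is only half served, while the content-type receiver is fully served by the same decoded set.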

A. Combination-like network 

We consider the combination network structure
$B(m, k, u)$, where $m < k$, shown in Fig. 2. The network
has four layers: the first layer is the $m$ sources and each
source connects to every node in the second layer; the
second layer has $k$ intermediate A nodes and each A
node connects to a B node in the third layer, which also
contains $k$ intermediate B nodes; the fourth layer contains
$n = \binom{k}{m} u^m$ receivers, where we have $u^m$ receivers
connected to each subset of size $m$ of the B nodes.

Theorem 1: In a $B(m, k, u)$ network,

$$G_w = u, \qquad \lim_{k \to \infty} G_a = u. \tag{4}$$

The proof of this theorem follows from Lemmas 1-3.

Lemma 1: In the network $B(m, k, u)$, content-type coding
achieves $R_c = m$.

Proof: Use network coding to multicast one specific
message from each type to all receivers. ■

Lemma 2: In the network $B(m, k, u)$, the worst case
message-specific coding rate is $R_w = m/u$.

Proof: We construct the following receiver requirements:
for every $u^m$ receivers that are connected to the same B
nodes, each request is an element of the set $B_1 \times \cdots \times B_m$,
and no two requests are the same. Since all $mu$ messages
are requested, we need to use the same set of $m$ A-B edges
at least $u$ times. Using network coding we can ensure that
at the end of $u$ transmissions, each receiver gets all $mu$
messages and thus also the $m$ messages required by her;
thus the transmission rate is $m/u$. Note that receiving all
$mu$ messages by each receiver is a worst case scenario in
terms of rate. ■
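A small numeric check of Lemmas 1 and 2 may help fix the quantities involved. The sketch below simply evaluates the receiver count $n = \binom{k}{m} u^m$, the rates $R_c = m$ and $R_w = m/u$, and their ratio; the function name and the sample parameters are illustrative choices, not taken from the paper.

```python
from math import comb


def combination_network_summary(m: int, k: int, u: int) -> dict:
    """Receiver count and worst-case gain for the network B(m, k, u)."""
    if not m < k:
        raise ValueError("the construction assumes m < k")
    receivers = comb(k, m) * u**m  # u^m receivers per size-m subset of B nodes
    r_content = m                  # Lemma 1: multicast one message per type
    r_worst = m / u                # Lemma 2: all mu messages must be delivered
    return {
        "receivers": receivers,
        "R_c": r_content,
        "R_w": r_worst,
        "G_w": r_content / r_worst,
    }


summary = combination_network_summary(m=2, k=4, u=3)
```

For these sample values the worst-case gain equals $u = 3$, in line with Theorem 1.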

Lemma 3: In a network B(m,k,u ), the average rate of 
the message-specific coding problem is bounded by 


R a < m 




(m+l ) 2 Vln u 



(5) 


Proof: We here provide a short outline and give the
complete proof in the appendix. We consider $m$ out of the
$k$ edges that connect A to B nodes, and the receivers
that can be reached only through these edges. We call
these $m$ edges and $u^m$ receivers a basic structure. We
argue that, through each such structure, we need to send
almost surely all messages to have a good average: with
high probability, all messages are required by less than
$u^{m-1}(1 + \delta_1)$ receivers. Thus the rate we can get through
each basic structure converges to $m/u$. ■

Fig. 2: Combination network structure.

Fig. 3: A complete pliable coding instance.

III. Content-type coding over erasure networks

We here make the case that, over erasure networks with
feedback, we can realize benefits by allowing the random
erasures to dictate the specific messages within a content
type that the receivers get - essentially, we shape what we
serve to the random channel realizations.

We consider the following content-type coding setup.
A server, $s$, has $k_1$ messages of content-type 1, $\mathcal{K}_1$, and
$k_2$ messages of content-type 2, $\mathcal{K}_2$ (e.g., an ad-serving
broadcasting station in a mall has $k_1$ sale coupons for
clothes and $k_2$ sale coupons for gadgets). Receiver $c_1$ wants
to receive all the $k_1$ sale coupons for clothes and some
(a fraction $\alpha$, any $\alpha k_2$) of the sale coupons for gadgets;
receiver $c_2$ wants the reverse, all the $k_2$ coupons for gadgets
and some (any $\alpha k_1$) of the coupons for clothes. The server
is connected to the two receivers through a broadcast
erasure channel with state feedback; in particular, when the
server broadcasts a message, each receiver $c_i$ receives it
correctly with probability $1 - \epsilon_i$, independently across time
and across receivers. Each receiver causally acknowledges
whether she received the message or not.

The corresponding message-specific scheme is as follows
[6]. The server wants to send to $c_1$ all the messages in $\mathcal{K}_1$
and in a specific subset of $\mathcal{K}_2$, $\mathcal{K}_2^1 \subset \mathcal{K}_2$, of size $\alpha k_2$; and
to $c_2$, all the messages in $\mathcal{K}_2$ and in a specific subset of
$\mathcal{K}_1$, $\mathcal{K}_1^2 \subset \mathcal{K}_1$, of size $\alpha k_1$. We again have independent
broadcast erasure channels with feedback.

Rate region: We say that rates $(r_1, r_2)$, with

$$r_1 = \frac{k_1 + \alpha k_2}{T}, \qquad r_2 = \frac{k_2 + \alpha k_1}{T}, \tag{6}$$

are achievable, if with $T$ transmissions by $s$ both $c_1$ and $c_2$
receive all they require.

Strategy for message-specific coding: The work in [6]
proposes the following achievability strategy and proves
it is capacity achieving. Recall that we use the subscript
to indicate the content type, and the superscript for the
receiver.

• Phase 1: The source repeatedly transmits each of the
messages in $\mathcal{K}_1 \setminus \mathcal{K}_1^2$ and $\mathcal{K}_2 \setminus \mathcal{K}_2^1$, until one (any one) of the
two receivers acknowledges it has received it.

• Phase 2: The source transmits linear combinations of the
messages in $\mathcal{K}_1^2 \cup \mathcal{K}_2^1$, those in $\mathcal{K}_1 \setminus \mathcal{K}_1^2$ that were not received
by $c_1$, and those in $\mathcal{K}_2 \setminus \mathcal{K}_2^1$ that were not received by $c_2$.

The intuition behind the algorithm is that, in Phase 1,
each private message (that only one receiver requests) is
either received by its legitimate receiver, or it becomes
side information for the other receiver. In Phase 2, each
receiver either wants to receive each message in a linear
combination or already has it as side information and can
thus subtract it. The source creates linear combinations that
are innovative (bring new information) to each receiver (e.g.,
by combining uniformly at random over a large enough
field [6]). The strategy achieves the rate region:

' 0 < n < min{ 1 - 61, 1 _ (1 _ 1 ^ ) 2 (1 _ a) } 

0 < r 2 < min{ 1, - e 2 , 1 _ (1 „ 1 ^ ) 1 (1 _ a) } 

* n 11 _ OL(j> 1 \ , £2 1 / 1 (7) 

1—€1 ^ 1 +a) ' 1—ei2 1+a — 

r 1 1 1 t-2 (1 _ atp2_\ -1 

1—€12 1+0 " T " 1—€ 2 V 1+0/ — 

where $\epsilon_{12} = \epsilon_1 \epsilon_2$, and $\phi_i = (1 - \epsilon_i)/(1 - \epsilon_{12})$ for $i = 1, 2$.

Strategy for content-type coding: We propose the
following strategy.

• Phase 1: For the messages in $\mathcal{K}_1$, denote by $\mathcal{K}_{1r}$ the
messages not yet transmitted. Initially $\mathcal{K}_{1r} = \mathcal{K}_1$.

1) The server repeatedly broadcasts a message in $\mathcal{K}_{1r}$,
until (at least) one of the receivers acknowledges it
has successfully received it. The message is removed
from $\mathcal{K}_{1r}$. If $c_1$ receives the message, she puts it into
a queue $Q_1^1$. If $c_2$ receives the message, she puts it
into a queue $Q_1^2$.

2) The server continues with transmitting the next message
in $\mathcal{K}_{1r}$, until either $\mathcal{K}_{1r}$ becomes empty, or

$$|\mathcal{K}_{1r}| + |Q_1^2| = \alpha k_1. \tag{8}$$

The server follows the same procedure for message set $\mathcal{K}_2$.

• Phase 2: The source transmits linear combinations of the
messages in the three sets: $\mathcal{K}_{1r} \cup \mathcal{K}_{2r}$, $Q_1^2 \setminus Q_1^1$, and $Q_2^1 \setminus Q_2^2$,
until both receivers are satisfied.
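Phase 1 of this strategy can be sketched as a short simulation for the message set of type 1. This is only an illustrative sketch: the function name, the use of Python's `random.Random` for channel erasures, and the representation of messages as integer indices are assumptions made here, not part of the paper.

```python
import random


def phase1_content_type(k1, alpha, eps1, eps2, rng):
    """Simulate Phase 1 for the type-1 messages.

    Returns (remaining, q_c1, q_c2, transmissions), where `remaining` plays the
    role of K_1r, and q_c1 / q_c2 are the type-1 queues of receivers c1 and c2.
    """
    remaining = list(range(k1))  # messages not yet transmitted
    q_c1, q_c2 = set(), set()
    transmissions = 0
    target = alpha * k1

    # Stop when K_1r is empty or |K_1r| + |queue of c2| has dropped to alpha * k1.
    while remaining and len(remaining) + len(q_c2) > target:
        msg = remaining.pop()
        while True:  # rebroadcast until at least one receiver gets the message
            transmissions += 1
            got1 = rng.random() > eps1
            got2 = rng.random() > eps2
            if got1 or got2:
                break
        if got1:
            q_c1.add(msg)
        if got2:
            q_c2.add(msg)
    return remaining, q_c1, q_c2, transmissions


rem, q1, q2, n_tx = phase1_content_type(1000, 0.9, 0.4, 0.3, random.Random(1))
```

With $\alpha = 0.9$ and these erasure probabilities, the loop is expected to stop through condition (8) with messages still left in the remaining set; with a small $\alpha$ it instead runs until the set is empty.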





Fig. 4: Comparison of the rate region, as defined in (6), for message-specific and content-type coding, across three cases: (a) Case 1: $\alpha < \min\{\phi_1, \phi_2\}$, $\alpha = 0.5$; (b) Case 2: $\phi_1 < \alpha < \phi_2$, $\alpha = 0.7$; (c) Case 3: $\alpha > \max\{\phi_1, \phi_2\}$, $\alpha = 0.85$. The shaded regions show the gains of content-type
over message-specific coding. The channel parameters are $\epsilon_1 = 0.4$ and $\epsilon_2 = 0.3$, which give $\phi_1 = 0.682$ and $\phi_2 = 0.795$.


The intuition behind the strategy is that, during Phase 1,
we transmit messages from $\mathcal{K}_1$ until we either run out of
messages, or both receivers want to receive the remaining
$\mathcal{K}_{1r}$: $c_1$ because she wants all the messages in $\mathcal{K}_1$, and $c_2$
because, on top of the $Q_1^2$ she has already received, she
also needs the $\mathcal{K}_{1r}$ to complete the fraction $\alpha k_1$. Note that
originally $|\mathcal{K}_{1r}| = k_1$ and $|Q_1^2| = 0$; at every step, the
quantity in (8) either remains the same (if $c_2$ receives the
message), or reduces by one (if she does not). Similarly for
$\mathcal{K}_2$. In the second phase, the source again transmits linear
combinations of messages that either a receiver wants, or
already has and can subtract to solve for what she wants.

Using the above method, we obtain the achievable
rate region of the content-type broadcasting scheme, stated in the following
theorem.

Theorem 2 : The rate region of the 1-2 content-type 
broadcasting communication with erasures is 

0 < 7T < min{l — ei, 

0 <r 2 < min{ 1 - e 2 , ±^-} 

< 7-1 n _ ct(0i-a) + l , r 2 (<f>i—a) + ^ 1 (9) 

1-ei L 1 1—a 2 1 1-ei 1-a 2 - 1 

r2 M _ a(<f>2~a ) + 1 , n (</>2~qQ + ^ ^ 

1 —€2 L 1—a 2 -I 1—62 1—a 2 — 

where $(x)^+ = \max\{x, 0\}$.

In fact, this achievable scheme achieves the capacity
for 2-1 content-type broadcasting communication with erasures.
The proofs of achievability and of the converse of Theorem 2
are given in Appendix B.

Fig. 4 compares the rate regions for content-type and
message-specific coding. For content-type coding, we have three distinct
cases, depending on the relative values of $\alpha$ and $\phi_i$. Note
that $\phi_i$ expresses the fraction of messages that $c_i$ receives
during Phase 1. Thus, if $\alpha < \min\{\phi_1, \phi_2\}$ (Fig. 4(a)),
$c_1$ and $c_2$ already receive the required $\alpha k_1$ and $\alpha k_2$ content-type messages during
Phase 1; essentially, broadcasting the content-type messages
comes for "free", with no additional rate cost over providing
$c_1$ with $\mathcal{K}_1$ and $c_2$ with $\mathcal{K}_2$. If $\min\{\phi_1, \phi_2\} < \alpha <
\max\{\phi_1, \phi_2\}$, say for instance $\phi_1 < \alpha < \phi_2$ (Fig. 4(b)),
$c_2$ receives the content-type messages for free, but
for $c_1$ we need additional transmissions in Phase 2. If
$\alpha > \max\{\phi_1, \phi_2\}$ (Fig. 4(c)), $c_1$ and $c_2$ require large
percentages of messages from the other type; interestingly,
when we have $\max\{\phi_1, \phi_2\} < \alpha < \min\{\phi_1/\phi_2, \phi_2/\phi_1\}$,
we can achieve the point $(1 - \epsilon_1, 1 - \epsilon_2)$, which implies
that all transmissions by $s$ are useful for both receivers.
Message-specific coding in general does not achieve this
point.
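The three cases can be checked numerically for the channel parameters used in Fig. 4. The sketch below evaluates $\phi_i = (1-\epsilon_i)/(1-\epsilon_1\epsilon_2)$ and reports which case a given $\alpha$ falls in, together with whether the condition for reaching the corner point $(1-\epsilon_1, 1-\epsilon_2)$ holds. The function names are hypothetical, and the treatment of boundary values (where $\alpha$ equals some $\phi_i$) is an arbitrary choice made here.

```python
def phi(eps_i: float, eps1: float, eps2: float) -> float:
    """Fraction of Phase 1 messages received by the receiver with erasure eps_i."""
    return (1 - eps_i) / (1 - eps1 * eps2)


def classify(alpha: float, eps1: float, eps2: float):
    """Return (case, corner): the case number of Fig. 4 and whether
    max(phi1, phi2) < alpha < min(phi1/phi2, phi2/phi1) holds."""
    p1, p2 = phi(eps1, eps1, eps2), phi(eps2, eps1, eps2)
    low, high = min(p1, p2), max(p1, p2)
    if alpha < low:
        case = 1
    elif alpha < high:
        case = 2
    else:
        case = 3
    corner = high < alpha < min(p1 / p2, p2 / p1)
    return case, corner


phi1 = phi(0.4, 0.4, 0.3)
phi2 = phi(0.3, 0.4, 0.3)
cases = [classify(a, 0.4, 0.3) for a in (0.5, 0.7, 0.85)]
```

For $\epsilon_1 = 0.4$ and $\epsilon_2 = 0.3$ this reproduces the values of $\phi_1$ and $\phi_2$ quoted in the caption of Fig. 4, and $\alpha = 0.85$ falls inside the interval where the corner point is reachable.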

IV. Content-type coding in the index coding framework

Previous work has investigated a specific type of content-type
coding within the framework of index coding, termed
pliable index coding [7], [8]. In index coding we have a
server with $m$ messages, and $n$ clients. Each client has as
side information a subset of messages, and requires one
specific message. The goal is to minimize the number of
broadcast transmissions so that each client receives the
message she requested. In pliable index coding, the clients
still have side information, but they no longer require a
specific message: they are happy to receive any message
they do not already have. The work in [7], [8] has shown
that, although in the worst case, for index coding, we may
require $O(n)$ transmissions, for pliable index coding we
require at most $O(\log^2 n)$, i.e., we can be exponentially
more efficient. This result was derived by using a bipartite
representation of the pliable index coding problem and a
randomized construction.

In this paper, apart from drawing attention to the fact
that pliable index coding is a form of content-type coding,
we also make two contributions: we derive an algebraic
condition for clients to be satisfied in pliable index coding,
and use this condition to prove a new lower bound: we
show that there exist pliable index coding instances where
$\Omega(\log n)$ transmissions are necessary.

Bipartite graph representation: We can represent a
pliable index coding instance using an undirected bipartite
graph, where on one side we have a vertex corresponding to
each of the $m$ messages, say $b_1, \ldots, b_m$, and on the other
side one vertex corresponding to each client, $c_1, \ldots, c_n$.
We connect with edges clients to the messages they do not
have. For instance, in Fig. 3, client $c_5$ does not have (and
would be happy to receive any of) $b_1$ and $b_2$.

A. An algebraic criterion for pliable index coding 

Assume that the source broadcasts $K$ linear combinations
of the $m$ messages over a finite field $F_q$; that is, the
$k$-th transmission is $\alpha_{k,1} b_1 + \alpha_{k,2} b_2 + \cdots + \alpha_{k,m} b_m$,
with $\alpha_{k,i} \in F_q$. We can collect the $K$ transmissions into
a $K \times m$ coding matrix $A$, where row $k$ contains the
linear coefficients $(\alpha_{k,1}, \alpha_{k,2}, \ldots, \alpha_{k,m})$ used for the $k$-th
transmission. We also denote by $\boldsymbol{\alpha}_i$ the $i$-th column
vector of the matrix $A$. Then each client receives $A\mathbf{b} = \mathbf{c}$,
where $\mathbf{b}$ is the $m \times 1$ vector that contains the messages
and $\mathbf{c}$ is a constant vector, and needs to use this together with his
side information to recover one message he does not have.

For client $c_j$, let us denote by $N[j]$ the set of indices of
messages that $c_j$ does not have, and by $N_C[j]$ the set of
indices of $c_j$'s side information. For example, $N[2] = \{1\}$
and $N_C[2] = \{2, 3\}$ for client $c_2$ in the example in Fig. 3.
Clearly, client $c_j$ can use his side information to remove
from the source transmissions the messages in $N_C[j]$; he
thus can recover from matrix $A$ the submatrix $A_{N[j]}$, which
only contains the columns of $A$ with indices in $N[j]$. That
is, if $\mathbf{b}_{N[j]}$ contains the messages he does not have, and $\mathbf{c}'$
is a constant vector, he needs to solve

$$A_{N[j]} \mathbf{b}_{N[j]} = \mathbf{c}', \tag{10}$$

to retrieve any one message he does not have.

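Lemma 4: In pliable index coding, client $c_j$ is satisfied
by the coding matrix $A$ if and only if there exists a column
$\boldsymbol{\alpha}_i$, with

$$\boldsymbol{\alpha}_i \notin \mathrm{span}\{A_{N[j] \setminus \{i\}}\}, \text{ for some } i \in N[j], \tag{11}$$

where $\mathrm{span}\{A_{N[j] \setminus \{i\}}\} = \{\sum_{l \in N[j] \setminus \{i\}} \lambda_l \boldsymbol{\alpha}_l \mid \lambda_l \in F_q,
\boldsymbol{\alpha}_l \in F_q^K\}$ is the linear space spanned by the columns of
$A_{N[j]}$ other than $\boldsymbol{\alpha}_i$.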

Proof: We are going to argue that, if such a column
$\boldsymbol{\alpha}_i$ exists, then client $c_j$ can uniquely decode the message $b_i$, which he does
not have, from his observations, and thus is satisfied. The
condition $\boldsymbol{\alpha}_i \notin \mathrm{span}\{A_{N[j] \setminus \{i\}}\}$ implies that any vector
in the null-space $\mathcal{N}(A_{N[j]})$ has a zero value in position
$i$; indeed, since column $\boldsymbol{\alpha}_i$ is not in the span of the other
columns, then for every vector $\mathbf{x} \in \mathcal{N}(A_{N[j]})$, the only
way we can have $\sum_{l \in N[j]} x_l \boldsymbol{\alpha}_l = 0$ is for $x_i = 0$. But the
solution space of any linear equation can be expressed as a
specific solution plus any vector in the null-space; and thus
from (10) we can get a unique solution for $b_i$ if and only if
any vector $\mathbf{x}$ in $\mathcal{N}(A_{N[j]})$ has a zero value in the element
corresponding to $i$, as is our case. We can retrieve $b_i$ by
column and row transformations in (10). ■
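The criterion of Lemma 4 can be checked mechanically for small instances. The sketch below restricts attention to the binary field GF(2), which is an assumption made here for simplicity (the lemma is stated for a general $F_q$); the helper names are hypothetical, and message indices are 0-based. A client is declared satisfied when removing some column $i$ of its missing set lowers the rank, which is equivalent to that column lying outside the span of the others.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors packed as integer bitmasks."""
    basis = []
    for v in vectors:
        # Reduce v against the basis; min() applies b only if it clears b's top bit.
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def column(A, i):
    """Column i of the K x m binary matrix A, packed into one integer."""
    return sum(row[i] << k for k, row in enumerate(A))


def client_satisfied(A, missing):
    """Lemma 4 over GF(2): some column indexed by `missing` is outside
    the span of the other columns indexed by `missing`."""
    cols = {i: column(A, i) for i in missing}
    for i in missing:
        others = [cols[l] for l in missing if l != i]
        if gf2_rank(others + [cols[i]]) > gf2_rank(others):
            return True
    return False


# A single transmission b1 + b2 + b3 (indices 0, 1, 2).
A = [[1, 1, 1]]
one_missing = client_satisfied(A, [0])     # client already has b2 and b3
two_missing = client_satisfied(A, [0, 1])  # client only has b3
```

With the single transmission $b_1 + b_2 + b_3$, a client missing one message can subtract its side information and decode, whereas a client missing two messages is left with their sum and cannot isolate either.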

B. A Lower Bound 

Theorem 3: There exist pliable index coding instances
that require $\Omega(\log(n))$ broadcast transmissions.


Proof: We constructively prove this theorem by providing
a specific such instance, that we term the complete
instance. In the complete instance, we have a client for
every possible side information set. In this case, the client
vertex set $A$ corresponds to
the power set $2^B$ of the message vertex set $B$, and we have
$m = \log_2(n)$ (note that we can have an arbitrary number
of additional messages and assume that all clients already
have these). We define $W_d(A)$ to be the set of client vertices
that have degree $d$ ($d = 0, 1, \ldots, m$), i.e., the vertices that
need $d$ messages. An example of the complete instance with
$m = 3$ is shown in Fig. 3. Obviously, we can trivially satisfy
all clients with $m = \log_2(n)$ transmissions, where each $b_i$
is sequentially transmitted once. We next argue that we
cannot do better.

We will use induction to prove that the rank of the coding
matrix $A$ needs to be at least $m$ for the receivers to be
satisfied according to Lemma 4. Let $J$ denote an index
subset of the columns; in the complete model, Lemma 4
needs to hold for any subset $J$. For $|J| = 1$, i.e., to satisfy
the clients who miss only one message, no column of the
coding matrix $A$ can be zero. Otherwise, if for example
column $i_1$ is zero, then the client who requests message $b_{i_1}$
cannot be satisfied. So $\mathrm{rank}(A_J) = 1$ for $|J| = 1$. Similarly,
for $|J| = 2$, any two columns of the coding matrix must be
linearly independent. Otherwise, if for example columns $i_1$
and $i_2$ are linearly dependent, then $\boldsymbol{\alpha}_{i_1} \in \mathrm{span}\{\boldsymbol{\alpha}_{i_2}\}$ and
$\boldsymbol{\alpha}_{i_2} \in \mathrm{span}\{\boldsymbol{\alpha}_{i_1}\}$, and the clients who only miss messages
$b_{i_1}$ and $b_{i_2}$ cannot be satisfied. So $\mathrm{rank}(A_J) = 2$.

Suppose we have $\mathrm{rank}(A_J) = l$ for $|J| = l$. For $|J| =
l + 1$, we can see that if all clients who only miss $l + 1$
messages can be satisfied, then for some $i \in J$, we have
$\boldsymbol{\alpha}_i \notin \mathrm{span}\{A_{J \setminus \{i\}}\}$. Therefore, $\mathrm{rank}(A_J) = \mathrm{rank}(\boldsymbol{\alpha}_i) +
\mathrm{rank}(A_{J \setminus \{i\}}) = 1 + l$. Therefore, to satisfy all the clients,
the rank of the coding matrix $A$ is $m$, resulting in $K \geq m$,
from which the result follows. ■
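For very small $m$, the lower bound can be confirmed by exhaustive search. The sketch below is restricted to binary coding matrices (an assumption made here; the theorem concerns any finite field) and checks satisfaction through the null-space characterization used in the proof of Lemma 4: a client is satisfied if some missing message has a zero coordinate in every null-space vector of the restricted matrix. Function names and 0-based indexing are illustrative choices; the client with an empty missing set is omitted since it needs nothing.

```python
from itertools import combinations, product


def satisfied(A, missing):
    """True if some position in `missing` is zero in every GF(2) null-space
    vector of A restricted to the columns in `missing`."""
    missing = list(missing)
    null_vectors = [
        x
        for x in product((0, 1), repeat=len(missing))
        if all(sum(row[i] * xi for i, xi in zip(missing, x)) % 2 == 0 for row in A)
    ]
    return any(
        all(x[pos] == 0 for x in null_vectors) for pos in range(len(missing))
    )


def complete_instance_solvable(m, K):
    """Search all binary K x m matrices for one satisfying every client of the
    complete instance (one client per non-empty set of missing messages)."""
    clients = [c for d in range(1, m + 1) for c in combinations(range(m), d)]
    for bits in product((0, 1), repeat=K * m):
        A = [bits[r * m:(r + 1) * m] for r in range(K)]
        if all(satisfied(A, c) for c in clients):
            return True
    return False


too_few = complete_instance_solvable(m=3, K=2)
enough = complete_instance_solvable(m=3, K=3)
```

For $m = 3$ the search finds no binary code with two transmissions, while three transmissions suffice, consistent with $K \geq m$.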

V. Conclusions and short discussion 

This paper introduced a new problem formulation, that
we termed content-type coding. Although there is significant
work in content-distribution networks, the work still
considers message-specific requests, where now multiple
requests are interested in the same message. The research
questions are focused mostly on where to store the content,
how much to replicate, how much to compress, and what
networks to use to upload and download content. There is
also a rich literature in network coding and index coding,
yet as far as we know, apart from the pliable index coding
formulation, there has not been work that has looked at
content-type coding.

We discussed in this paper three examples where, if
we realize that we need to serve content-type rather than
specific messages, we can have significant benefits. We
believe that there are many more scenarios where we can
realize benefits, as downloading content-type rather than
message-specific content can help all aspects of content
distribution networks, ranging from storage to coding to
delivery. Even in the specific formulations we proposed
in this paper, there are several follow-up directions and
extensions, for instance looking at more than two users and
multihop communication over erasure networks.
Appendix A
Proof of Lemma 3 


structure. From the Chernoff bound, we have that

$$\Pr\{E_{ij}^{\delta_1}\} = \Pr\{s_{ij} > E[s_{ij}](1 + \delta_1)\} \leq e^{-\frac{u^{m-1} \delta_1^2}{3}}, \tag{12}$$

and

$$p_1 = \Pr\{E^{\delta_1}\} \leq um \Pr\{E_{ij}^{\delta_1}\} \leq um \, e^{-\frac{u^{m-1} \delta_1^2}{3}}. \tag{13}$$

Here we denote by $p_1$ the probability that an abnormal
event happens. When an abnormal event happens for this
basic structure, the rate for this structure is (at most) $m$. If
an abnormal event does not happen for this basic structure,
the rate for this structure is (at most) $m u^{m-1}(1 + \delta_1)/u^m =
m(1 + \delta_1)/u$.

Next, we consider the $v = \binom{k}{m}$ structures. Let us
denote by $T_e$ the number of structures with an abnormal
event happening. The expected value of $T_e$ is $v p_1$. The
probability that $\{T_e > v p_1 (1 + \delta_2)\}$ happens, denoted by
$p_2$, can be bounded using the Chernoff bound:

$$p_2 = \Pr\{T_e > v p_1 (1 + \delta_2)\} \leq e^{-\frac{v p_1 \delta_2^2}{3}}. \tag{14}$$

Hence, if the event $\{T_e > v p_1 (1 + \delta_2)\}$ does not happen,
the number of structures with an abnormal event is at most
$v p_1 (1 + \delta_2)$. Therefore, the average rate can be bounded by


$$R_a \leq p_2 m + (1 - p_2) \, \frac{v p_1 (1 + \delta_2) \, m + \left(v - v p_1 (1 + \delta_2)\right) \frac{m(1 + \delta_1)}{u}}{v} \leq p_2 m + p_1 (1 + \delta_2) \, m + \frac{m(1 + \delta_1)}{u}. \tag{15}$$

Let us set i 5 i = t/2 anc j ^ 

U 2 

Then we have 

_ m + 1 

Pi < mu 2 ? 

_ m±l 

P2 < U 2 . 



Let us denote by $\mathcal{E} = \{e_1, e_2, \ldots, e_k\}$ the set of $k$ edges
connecting A nodes and B nodes. We refer to $m$ different
edges in $\mathcal{E}$ and the $u^m$ receivers that are only connected to
them, as a basic structure. From construction, there are $\binom{k}{m}$
such structures. For each structure, it is straightforward to
see that:

• For any basic structure, the maximum transmission
rate through these $m$ edges to the receivers in this
structure is $m$ (since the min-cut is $m$).

• Denote by $s_{ij}$ the number of requirements of message
$b_{ij}$ for receivers in a basic structure (i.e., the
number of receivers requesting this message). Then
the maximum transmission rate is $\sum_{l=1}^{m} \max_l s_{ij} / u^m$,
where $\max_l s_{ij}$ represents the $l$-th maximum number
among $\{s_{ij} \mid 1 \leq i \leq m, 1 \leq j \leq u\}$. This means
that the maximum transmission rate is achieved by
transmitting the $m$ most popular messages through
these $m$ edges.

Consider any given basic structure. We first observe
that $E[s_{ij}] = u^{m-1}$. We define the event $E_{ij}^{\delta_1} =
\{s_{ij} > E[s_{ij}](1 + \delta_1)\}$ as an abnormal event with respect
to the message $b_{ij}$. We also define the event $E^{\delta_1} =
\bigcup_{1 \leq i \leq m, 1 \leq j \leq u} E_{ij}^{\delta_1}$ as an abnormal event for this basic

Plugging Eq. (16) into Eq. (15), we obtain an upper
bound for the average rate:

R a < P2m+pi{l + 62)171. + 

^ m 1 m 1 ra 2 (1 + Cln u ) , ( m + lpVlnu ( 17 ) 

^ TT ^-ttTTT i--1- ; . 


Setting k = him., and to = /12TO where hi and /12 are 
constants, we have. 


pa , m 


as k —> 00. 


Appendix B 
Proof of Theorem 2 


A. Achievability 

To prove this theorem, we assume that $k_1$ and $k_2$ are
large. Recall that we define $\epsilon_{12} = \epsilon_1 \epsilon_2$ and $\phi_i = (1 -
\epsilon_i)/(1 - \epsilon_{12})$ for $i = 1, 2$. Let us denote by $k_1'$ and $k_2'$
the number of messages transmitted from sets $\mathcal{K}_1$ and $\mathcal{K}_2$
at the end of phase 1. Therefore, the average number of
transmissions needed to complete phase 1 is:

$$N_1 = \frac{k_1' + k_2'}{1 - \epsilon_{12}}. \tag{19}$$




Fig. 5: Content type coding model for the broadcast erasure channel with feedback.


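On average, the number of messages from set $\mathcal{K}_j$ ($j =
1, 2$) received by receiver $i$ ($i = 1, 2$) is:

$$M_j^i = k_j' \phi_i. \tag{20}$$

From the algorithm, we know that $k_1 - k_1' = (\alpha k_1 -
M_1^2)^+$ and $k_2 - k_2' = (\alpha k_2 - M_2^1)^+$. Therefore, we have

$$k_1 - k_1' = \frac{(\alpha - \phi_2)^+}{1 - \phi_2} k_1, \qquad k_2 - k_2' = \frac{(\alpha - \phi_1)^+}{1 - \phi_1} k_2. \tag{21}$$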
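The Phase 1 quantities can be evaluated numerically under the large-$k$ (fluid) approximation used in this proof. The sketch below assumes that the type-1 set stops according to what $c_2$ has collected and the type-2 set according to what $c_1$ has collected, which is how we read the stopping rule (8) together with (20); the function names are hypothetical.

```python
def phase1_sent(k: float, alpha: float, phi_other: float) -> float:
    """Expected number k' of messages of one type sent in Phase 1.

    Solves k - k' = (alpha * k - k' * phi_other)^+, where phi_other is the
    Phase 1 reception fraction of the receiver that wants only a fraction alpha.
    """
    if alpha <= phi_other:
        return float(k)  # the whole set is transmitted before rule (8) triggers
    return k * (1 - alpha) / (1 - phi_other)


def phase1_transmissions(k1, k2, alpha, eps1, eps2):
    """Return (k1', k2', N1), with N1 computed as in Eq. (19)."""
    eps12 = eps1 * eps2
    phi1 = (1 - eps1) / (1 - eps12)
    phi2 = (1 - eps2) / (1 - eps12)
    k1p = phase1_sent(k1, alpha, phi2)  # type 1 stops based on receiver c2
    k2p = phase1_sent(k2, alpha, phi1)  # type 2 stops based on receiver c1
    return k1p, k2p, (k1p + k2p) / (1 - eps12)


k1p, k2p, n1 = phase1_transmissions(1000, 1000, 0.85, 0.4, 0.3)
```

For $\alpha = 0.85$, $\epsilon_1 = 0.4$ and $\epsilon_2 = 0.3$, both sets stop early, since $\alpha$ exceeds both $\phi_1$ and $\phi_2$.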

In phase 2, the required number of messages for receiver
$i$ ($i = 1, 2$) is then

$$K_i = (k_i' - M_i^i) + (k_1 - k_1') + (k_2 - k_2'), \tag{22}$$

where the first term is the number of erased messages
from the set $\mathcal{K}_i$ that are received by the other receiver, and the
second and third terms are the remaining messages to be
transmitted.

For phase 2, the average number of transmissions needed
is

$$N_2 = \max\left\{\frac{K_1}{1 - \epsilon_1}, \frac{K_2}{1 - \epsilon_2}\right\}. \tag{23}$$

Then, the rate region can be calculated as:

$$\left\{(r_1, r_2) : r_1 \geq 0, \; r_2 \geq 0, \; r_1 = \frac{k_1 + \alpha k_2}{T}, \; r_2 = \frac{k_2 + \alpha k_1}{T}, \; N_1 + N_2 \leq T\right\}, \tag{24}$$

where $T$ is an auxiliary variable and can be cancelled out.
Plugging (19) and (23) into (24), we get the theorem.

Note that for $\max\{\phi_1, \phi_2\} < \alpha < \min\{\phi_1/\phi_2, \phi_2/\phi_1\}$,
the conditions are simplified as $r_1 \leq 1 - \epsilon_1$ and $r_2 \leq 1 - \epsilon_2$,
implying that the maximum rates can be achieved.

B. Converse 

To prove the converse of the theorem, we use an
information-theoretic argument to show that this rate region is
tight. We first depict the system model in Fig. 5. Recall
that we denote by $s$ the transmitter, and by $\mathcal{K}_1$ and $\mathcal{K}_2$ the
sets of two types of messages. We denote by $k_1$ and
$k_2$ the cardinalities of $\mathcal{K}_1$ and $\mathcal{K}_2$. Two receivers request
information messages from the server $s$. Receiver 1 requests
all information messages from $\mathcal{K}_1$ and a fraction $\alpha$ of the
information messages from $\mathcal{K}_2$. In contrast, receiver 2
requests all information messages from $\mathcal{K}_2$ and a fraction $\alpha$ of
the information messages from $\mathcal{K}_1$. Note that in our content-type
coding problem, receiver 1 is satisfied as long as she
successfully receives $\alpha k_2$ messages from $\mathcal{K}_2$, regardless
of which ones; the same holds for receiver 2.

We consider broadcasting over an erasure channel with
feedback. In this channel, when a message $x$ is transmitted,
receiver $i$ receives the message correctly with probability
$1 - \epsilon_i$ and receives nothing with probability $\epsilon_i$
(i.e., the message is erased for receiver $i$). A
message is received or erased independently for receivers
1 and 2. We aim to find the capacity region $(r_1, r_2)$ of
this broadcast channel. Without loss of generality, let us
assume $|\mathcal{W}_1| = |\mathcal{W}_2| = |\mathcal{X}| = |\mathcal{Y}_1| - 1 = |\mathcal{Y}_2| - 1 = 2$.

To prove the converse, we just need to show the first 
and the third equations in (9), and then according to the 
symmetry, we get the whole set of equations. For the first 
equation, it is equivalent to point-to-point communication, 
so we can directly get it. For the third equation, we consider 
two parts: 

+ _h_ + _^< i, 

1 - Cl 1 - Cl “ ’ 1-612 1 - 6 2 - 

First, we consider 

n 

n > E H{X t ) 

t= 1 
n 

> E HiXtlYf-^S*- 1 ) 

£—1 

n 

= '£mX t \Y*- 1 ,Y*- 1 ,S t - 1 ) 

4=1 1251 

+ I(Xf,Yt 1 \Y{-\S t ~ 1 ) 

n 

= E [H{X t IPPf 1 , , Yt 1 , Yt 1 ,5*- 1 ) 

t ~+I(X t ;Y*- 1 \Y*-\S t - 1 ) 

+ I(X t ; Wf 1 , IE*" 1 , Yt 1 , S*- 1 )] 

and the following two conditions: 


(fci + k 2 ) - ne n < I(W* , ; F", Y?, S n ) 

n 

= E HY^Y^SuWfWVtlYt 1 ^- 1 ,#- 1 ) 

t= 1 
n 

= E KXu, Y 2 ,u W-1 1 , wt IE 4 - 1 , Yt\ S*- 1 , St) 

t= 1 
n 

= E I(X t ; , W 2 k2 IF/" 1 , Yt 1 , S*- 1 ) Pr(S t ^ 0) 

t= 1 

n 

= (1 - ei 2 ) E I{Xt\ W kl , W^" 1 1 F / -1 , F 2 t_ 1 , S'* -1 ), 


t—l 

( 26 ) 



























where the above conditions hold due to Fano's inequality
and the incidence property of $S_t$, and

<I(W^ 1 -,X t \Y^- 1 ,S t ~ 1 ) 

n 

+ I(Xf,Y*- 1 \Y*~ 1 ,S t - 1 )] 

n 

< t^ij + E[/(x t; y 2 ‘- 1 |y 1 t - 1 ,5 t - 1 )] ) 

where the last inequality follows from (using the same idea 
as (26)) 

ki > I(W kl ; Y", Y 2 ra , S n ) 

n (28) 

t =1 

Plugging (26) and (27) into (25), we get the first part of 
the third equation in (9). 

Similarly, we can get the second part of the third equation 
in (9) using 

(fci + ak 2 ) - ne n < I(Wf 1 , W? k2 ; Yf, S n ) 

n 

= (1 - d) E 

(29) 

< (1 - ci) E H{X t ) 

t= 1 

< (1 - ei )n. 



